NLP-Based Content Recommender

ABSTRACT

Methods, techniques, and systems for using natural language processing to recommend content related to an associated text segment or document. Example embodiments provide an NLP-based content recommender (“NCR”) which uses NLP-based search techniques, potentially in conjunction with context or other related information, to locate and provide content related to entities that are recognized in the associated material. NCRs may be embedded as widgets, for example on Web pages to assist users in their perusal and search for information, provided by means of browser plug-ins or other application plug-ins, provided in libraries or in standalone environments, or otherwise integrated into other code, programs, or devices. This abstract is provided to comply with rules requiring an abstract, and it is submitted with the intention that it will not be used to interpret or limit the scope or meaning of the claims.

TECHNICAL FIELD

The present disclosure relates to methods, techniques, and systems for presenting content using natural language processing and, in particular, to methods, techniques, and systems for recognizing named entities using natural language processing and presenting content related thereto.

BACKGROUND

With more than 15 billion documents on the World Wide Web (the Web) today, it has become very difficult for users to find desired information or to discover relevant information. Typically, a user engages a keyword (Boolean) based search engine to enter terms that s/he thinks relate to the topic of interest. Unfortunately, there could be hundreds of thousands of documents with similar keywords, requiring readers to sort out what is relevant. Moreover, once a user has followed links (e.g., hyperlinks, hypertext, indicators, etc.) to more than a few web pages, it is highly likely that the user has navigated to a point that makes it difficult to retrace steps.

Thus, although the volume of documents on the Web potentially makes a lot more information available to the average person, it takes a fair bit of time to actually find documents that are useful.

BRIEF DESCRIPTION OF THE DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with color drawings will be provided by the Office upon request and payment of the necessary fee.

FIG. 1A is an example screen display of an example mechanism for invoking an NLP-Based Content Recommender from a web page displayed in a web browser.

FIG. 1B is an example screen display of an example NLP-Based Content Recommender presented to recommend content relating to underlying text.

FIG. 1C is an example screen display illustrating the result of selection of one of the named entities in the underlying content.

FIG. 1D is an example screen display of example refinements of recommendations of an example NLP-Based Content Recommender based upon selection of a node in a connections map.

FIG. 1E is an example screen display of an NLP-Based Content Recommender playing a selected video.

FIG. 2 is an example screen display of another type of NLP-Based Content Recommender widget presented adjacent to content.

FIGS. 3A-3E are example screen displays of a named entity profile presented by an example embodiment of an NLP-Based Content Recommender.

FIGS. 4A-4C are example screen displays of another type of NLP-Based Content Recommender widget presented adjacent to content.

FIG. 5 is an example code for installing an example embodiment of an NLP-Based Content Recommender in a content creator's Web page.

FIGS. 6A-6D illustrate example screen displays for an example embodiment of an NLP-Based Content Recommender in the form of links to further information.

FIGS. 7A and 7B are example screen displays that illustrate use of the widgets shown in FIGS. 1A-1D integrated into an application.

FIGS. 8A and 8B illustrate example screen displays for an example embodiment of an NLP-Based Content Recommender in the form of graphical links that can be used to navigate to further information.

FIG. 9 is another example screen display of a graphical representation of connections.

FIGS. 10A and 10B illustrate another interface for presenting related content to an underlying named entity.

FIGS. 11A-11C illustrate another example NCR widget that combines some of the previously described textual and graphical presentations to present related and/or auxiliary information.

FIGS. 12A-12C illustrate another example NCR widget integrated into a website that provides links to news and blog information.

FIG. 13 is an example block diagram of an example computing system that may be used to practice embodiments of an NLP-Based Content Recommender.

DETAILED DESCRIPTION

Embodiments described herein provide enhanced computer- and network-assisted methods, techniques, and systems for using natural language processing techniques, potentially in conjunction with context or other related information, to locate and provide content related to entities that are recognized in associated material. Example embodiments provide one or more NLP-based content recommenders (“NCRs”) that each, based upon a natural language analysis of an underlying text segment, determine which entities are being referred to in the text segment and recommend additional content relating to such entities.

NCRs may be useful, for example, to support a user browsing pages of content on the Web. One or more NCRs may be embedded as widgets on such pages to assist users in their perusal and search for information, provided by means of browser plug-ins or other application plug-ins, provided in libraries or in standalone environments, or otherwise integrated into other code, programs, or devices.

For example, when a news article is being displayed in a Web browser, an NCR may be invoked to suggest additional relevant content by recognizing the entities referred to in the article and determining relevant additional content, organized by a number of factors, for example, by frequency of appearance of other information relating to one of the recognized entities in the article, by knowledge of the browse patterns of the reader, etc. An NCR might also be invoked to allow the reader to explore the top entities “connected” to one of the entities selected from the entities recognized in the news article. Connectedness in this sense refers to entities which are related to the selected recognized entity typically through one or more actions (verbs). Or an NCR might be invoked to “filter” or otherwise rank or order the content presented to the user.

FIG. 1A is an example screen display of an example mechanism for invoking an NLP-Based Content Recommender from a web page displayed in a web browser. In FIG. 1A, web browser 100 is shown displaying news article 104. An icon 150 labeled “Evri” is displayed for invoking the NCR.

FIG. 1B is an example screen display of an example NCR presented to recommend content relating to underlying text. The news article web page 104 is shown presented using web browser 100 as described with reference to FIG. 1A. An example NLP-based content recommender 101 is displayed as a pop-up window accessible from the icon 150 shown in FIG. 1A. The example embodiment of NCR 101 shows a section 103 of “Top Related Articles” and a filter section 102 of focus terms that may be used to filter the top related articles shown in section 103.

In at least some embodiments, the NCR may use context information relating to source information that was used to establish and identify the entities (e.g., verbs, related entities, entities within close proximity in the underlying text or in other text, or other clues) in the recommendations. In some embodiments, algorithms are employed for natural language-based entity recognition and disambiguation to determine which entities are present in the underlying text. For example, these algorithms may be incorporated to display an ordered list of all, or the most important, or the top “n” entities present on a Web page in conjunction with the underlying page. The items on the list can then be used to navigate to additional (related) content, for example, as “links” or other references to the content. The example NCR illustrated in FIG. 1B performs extensive NLP-based searching and processing in the background to identify the entities in the underlying article 104 and then to find and order the top related articles that are displayed in section 103.
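
By way of illustration only, the top “n” ordering described above can be sketched as follows. The sketch assumes that entity recognition and disambiguation have already produced a flat list of entity mentions, and it uses frequency of mention as a stand-in for “most important”; the function name top_entities and the sample mentions are hypothetical and are not taken from the referenced applications.

```python
from collections import Counter


def top_entities(mentions, n=3):
    """Return the n most frequently mentioned entities, most frequent first.

    `mentions` is assumed to be the output of an upstream entity
    recognition and disambiguation step (not shown here).
    """
    counts = Counter(mentions)
    return [name for name, _ in counts.most_common(n)]


# Hypothetical mentions recognized in an underlying article.
MENTIONS = [
    "Barack Obama", "Ohio", "Barack Obama",
    "White House", "Ohio", "Barack Obama",
]

FOCUS_LIST = top_entities(MENTIONS, n=2)
print(FOCUS_LIST)
```

Other definitions of importance (for example, popularity among a set of documents searched) could be substituted by changing how the counts are computed.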

An example system that supports the generation of an ordered list of entities is described in co-pending U.S. patent application Ser. No. 12/288,158 titled “NLP-Based Entity Recognition and Disambiguation,” which is incorporated by reference in its entirety. In addition, in at least some embodiments, an NLP-based search mechanism can be incorporated by an NCR to find related (e.g., auxiliary or supplemental) information to recommend. Contextual and other information, such as information from ontology knowledge base lookups or from other knowledge repositories, may also be incorporated in establishing information to recommend. One such system and method for generating related content using relationship searching is encompassed in the InFact® relationship search technology (now the Evri relationship search technology), described in more detail in U.S. patent application Ser. No. 11/012,089, filed Dec. 13, 2004, which published on Dec. 1, 2005 as U.S. Patent Publication No. 2005/0267871A1, and which is hereby incorporated by reference in its entirety. In this system, NLP-based processing is used to locate entities and the connections (relationships) between them based upon actions that link a source entity to a target entity, or vice versa (i.e., queries that specify a subject and/or an object, and zero or more verbs that may relate them).
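
The shape of such a relationship query (a subject and/or an object, plus zero or more verbs) can be illustrated with a minimal sketch. This is not the InFact®/Evri implementation or its query language; the Relation type, the find_relations function, and the sample relations are assumptions introduced only for illustration, and the relations are presumed to have been extracted by NLP processing that is not shown.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    """One extracted relationship: a source entity, an action, a target entity."""
    subject: str
    verb: str
    obj: str


def find_relations(relations, subject=None, obj=None, verbs=()):
    """Return relations matching a subject and/or an object and any of the verbs.

    An empty `verbs` sequence means "any action".
    """
    if subject is None and obj is None:
        raise ValueError("specify a subject and/or an object")
    return [
        r for r in relations
        if (subject is None or r.subject == subject)
        and (obj is None or r.obj == obj)
        and (not verbs or r.verb in verbs)
    ]


# Made-up sample relations, for illustration only.
RELATIONS = [
    Relation("Barack Obama", "visit", "Ohio"),
    Relation("Barack Obama", "visit", "White House"),
    Relation("Jennifer Brunner", "govern", "Oberlin College"),
]

print(find_relations(RELATIONS, subject="Barack Obama"))
print(find_relations(RELATIONS, obj="Oberlin College", verbs=("govern",)))
```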

In addition, the InFact®/Evri technology provides a query language called “IQL” (now “EQL”) and a navigation tip system with query templates for generating relationship queries with or without a graphical user interface. Query templates and the navigation tip system may be incorporated by other code to automatically generate generalized searches of content that utilize sophisticated linguistics and/or knowledge-based analysis. The InFact®/Evri tip system not only performs the NLP-based search, but can order the results as desired. In addition, the tip system can dynamically evolve the searches, and hence the related entities, as the underlying text is changed, for example by filtering it using focus terms 102 in FIG. 1B. Additional information on the InFact®/Evri navigation tip system is found in U.S. patent application Ser. No. 11/601,602, filed Nov. 16, 2006, which published on Jul. 5, 2007 as U.S. Patent Publication No. 2007/0156669A1, and in U.S. patent application Ser. No. 12/049,184, filed Mar. 14, 2008, which are herein incorporated by reference in their entireties. Other and/or different NLP-based processing may similarly be incorporated by example embodiments of an NCR.

In at least some embodiments, NCRs are provided by means of a user interface control displayed adjacent to, proximate to, on, or near other displayed content such as illustrated in FIG. 1B. Such an interface control can be implemented in the form of a “widget” (e.g., a code module, excerpt, script, etc.), which can be made available to third parties and other content providers to associate with content they control. In addition, a user or other widget consumer (such as a content creator or distributor) can download a widget provided via a URI or URL (uniform resource identifier or locator), web portal, server, etc. For example, a content creator may download an NCR widget for installing it as a plug-in in the creator's blogging platform. The NCR widget may have one or more associated representations, i.e., icons, images, or graphical symbols, which may take many different forms, and which can be displayed on a display screen and used to invoke the functionality of the widget. In some embodiments, customizations, such as different UI renderings, color schemes, capabilities, etc. may also be available when the widget is installed. Also, in some embodiments, NCR widget end users (those using the widgets to display related content) may also be provided with customizations.

FIG. 5 is an example code for installing an example embodiment of an NCR in a content creator's Web page. In particular, the script 501 may be integrated to provide a pop-up window NCR widget, such as that illustrated in FIG. 1B. In this example, the script 501 and installation notes 502 are provided on a Web page controlled by the widget creator. Although the particular script 501 is written in HTML (which includes JavaScript), appropriate other scripting languages (e.g., Ruby, Perl, and Python) can be used in other environments to include an NCR widget. For example, a VisualBasic script may be used to provide a similar NCR widget in a Microsoft Office environment.

Such widgets need not be limited to displaying related content accessible via a Web browser. Indeed, NCR widgets also may be useful in a variety of other contexts and platforms, such as to create other mechanisms for finding sought after data in large repositories of information (e.g., corporate intelligence data bases, product information, etc.), to perform research or other discovery, to provide learning tools in educational environments, to navigate newsletters and archived articles for a company, etc. NCRs are intended to aid in conveying meaningful information to end users from among a morass of data without them necessarily knowing how to search for that information. They are intended to do a better job at emulating “understanding” the underlying text than a keyword search engine would, so that users can search less and understand more, or discover more with less work.

NCR widgets present user interfaces that may vary depending upon the context in which they are integrated, their use, etc. FIGS. 1B-13 illustrate several different example embodiments of forms for such widgets that contain content summary information, and controls for navigating to related or other contextually-significant information. In at least some embodiments, an indicator (such as a hypertext link, or hyperlink) is displayed proximate to a respective entity if more or recommended related information is available. Also, in some embodiments user interface controls are provided to navigate to and among the various supplemental information. For example, one or more indicators for navigating to the supplemental information may be presented. These indicators may be presented in the form of links, graphical symbols, icons, shapes, logos, trademarks, or the like. Many variations for presenting widgets/user interface controls are possible, and the ones presented in FIGS. 1B-13 are merely illustrative and not intended to be exhaustive.

In at least some of the NCRs, the name of an entity (e.g., Barack Obama) is provided along with an indication of the type of entity and/or its roles (e.g., categories or facets, such as senator, democrat, presidential candidate). Then, for some NCRs, a list of facts about the entity and/or an overview of further content is displayed. In at least some embodiments, an image associated with the named entity is also displayed. Importantly, if more information (as determined by the NCR) is available, then a link (also referred to as a hyperlink, hypertext, or other indicator) may be displayed. The link may be operated (e.g., selected or navigated to) by a user to navigate to recommended content. Other features, including more or different features, may be provided or combined in an embodiment of an NCR as helpful in the context.

For example, as described earlier, the example NCR 101 in FIG. 1B is provided in a pop-up window on top of underlying news article content 104. The “Focus On” list in filter section 102 is created using the natural language processing methods described above. In particular, section 102 lists the “most important” named entities found in the underlying content as determined by NLP-based relationship searching (such as that provided by the InFact®/Evri relationship search technology). Different definitions of “most important” may be used in the NCR, including but not limited to frequency of use in the article, popularity among a set of documents searched, etc.

FIG. 1C is an example screen display illustrating the result of selection of one of the named entities in the underlying content. In particular, when the user selects, from the filter section 102, the link “Barack Obama” 105, which is a named entity found in the underlying content, the top related articles section 108 changes to reflect new recommendations. In at least some embodiments, the NCR executes a natural language based relationship query, such as an EQL query, in the background against some body of documents. The resulting information can be used to populate various fields in the user interface of the NCR and to find and suggest the recommended content that is displayed to the end user when, for example, the user navigates to such content via a displayed link. Accordingly, the related articles section 109 shows the result of executing a query of Barack Obama relating (in one or more ways described by actions/verbs) to one or more of the named entities in the underlying content (the news article 104).

The illustrated NCR 110 also includes a “Connections” section 106, which provides a graphical map of the entities related to the selected named entity 105. The entities included in the graphical map 106 may be selected by the NCR 110 as the most popular entities, the most frequently described in the top related articles, or using other rules. In one embodiment, as shown, the entities in the connections map 106 are color-coded based upon their base type: for example, whether they are persons, places, or things (which may include organizations, products, etc.). An end user may select one of the nodes 107 on the map 106, to further change the recommendations by refining what is considered “related.”
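
As a rough sketch of how nodes for such a connections map might be assembled, the following selects the entities most frequently described in a set of related articles and attaches a color by base type. The function name, the dictionary layout, and the sample data are hypothetical, and the color assignments are an assumption borrowed from the NCR of FIG. 2; other selection rules or color schemes could equally be used.

```python
from collections import Counter

# Assumed color scheme: persons green, places blue, things red.
BASE_TYPE_COLORS = {"person": "green", "place": "blue", "thing": "red"}


def build_connection_nodes(connected, base_types, limit=5):
    """Build map nodes for the `limit` most frequently connected entities.

    `connected` lists one entry per occurrence of a connected entity in the
    related articles; `base_types` maps each entity name to its base type.
    """
    counts = Counter(connected)
    return [
        {"name": name, "mentions": count, "color": BASE_TYPE_COLORS[base_types[name]]}
        for name, count in counts.most_common(limit)
    ]


# Hypothetical occurrences of entities connected to the selected entity.
CONNECTED = [
    "Ohio", "Jennifer Brunner", "Ohio",
    "Social Security Administration", "Ohio", "Jennifer Brunner",
]
BASE_TYPES = {
    "Ohio": "place",
    "Jennifer Brunner": "person",
    "Social Security Administration": "thing",
}

NODES = build_connection_nodes(CONNECTED, BASE_TYPES, limit=2)
print(NODES)
```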

FIG. 1D is an example screen display of example refinements of recommendations of an example NCR based upon selection of a node in a connections map. In FIG. 1D, the user has selected the node “Ohio” 106 in the illustrated NCR 120, which has caused the NCR to change its background searches to focus the recommendations on articles in which “Barack Obama” is connected (related by action/verb) to the entity “Ohio.” This changed focus is reflected in field 122. The articles now displayed in the recommended articles section 121 reflect the top articles that describe something about Barack's interactions with Ohio. Full profiles (descriptions of useful, related information) are obtainable by selection of the links for the named entities in the related articles section 121; that is, for the entities 124.

Example NCRs also may include still and/or video images. By selecting link 123, the user can navigate to recommended videos that relate to the relationship between “Barack Obama” and “Ohio.” Note that these recommendations may also be ordered and/or ranked. FIG. 1E is an example screen display of an NCR playing a selected video. A video 132 of Barack Obama is played in response to user selection of the video link 122. Images, when available, may be displayed similarly.

Note that FIGS. 1B-1E provide an example of one type of NCR. Many other examples, including ones with very different appearing user interfaces, may be implemented.

FIG. 2 is an example screen display of another type of NCR widget presented adjacent to content. The NCR 201 is provided next to news article 200 and comprises an entity information section 202, a related articles section 203, and a connections map 205. The related articles section 203 and connections map 205 operate similarly to those described with reference to FIGS. 1B-1E. As is observable, in this particular NCR 201, persons (e.g., Jennifer Brunner node 208) are color coded in green, places (e.g., Ohio 207) in blue, and things such as organizations (e.g., social security administration 206) in red. The entity information section 202 includes named entities from the article 200, ordered. In the embodiment shown, they are ordered by importance. Other orderings can be similarly incorporated. The NCR 201 also displays a link 204 to the profile (description of the named entity) of the most relevant named entity “Jennifer Brunner.”

FIGS. 3A-3E are example screen displays of a named entity profile presented by an example embodiment of an NCR. As can be observed, the user interface and controls are different than those provided in FIGS. 1B and 2; however, many of the same capabilities of an NCR are present. In particular, the example NCR of FIG. 3A provides a connection map 301 and a top articles section 303 that recommends the “top” articles relating to the named entity “Jennifer Brunner.” Again, these articles may be ordered based upon recency and/or frequency of mentions of Ms. Brunner, popularity of access to the articles, the most relationships with entities connected to Ms. Brunner, or based upon other definitions of topmost. The NCR also provides a user interface control 302 for modifying (by filtering based upon action) the articles 303 displayed. In addition, the NCR includes a recommended images area 307 with links to one or more images; a recommended videos area 308 with links to one or more videos; a section reserved for advertisements 306 (which may also be targeted to the profile being displayed); top connections links 304 to explore profiles of the entities most current and relevant to the displayed profile and to filter the top articles section 303; and an about section 305, which contains a brief description and fast facts regarding the named entity whose profile is being displayed.

FIG. 3B illustrates details of the connections map shown in FIG. 3A. In particular, in connections map 320, the large central circle (or node) (e.g., node 311) represents the profiled person, place, or thing. The smaller nodes (e.g., node 312) are its top connections. The lines between the nodes (e.g., line/dot 310) represent the actual connection, which may be presented, for example, when the user hovers an input device over the dot on the line. When a user selects the action (e.g., dot 310), the top articles section is updated to reflect that connection.

FIG. 3C illustrates the modifications to the articles 332 displayed when the user interface control 330 is selected to cause filtering based upon a selected action. Here, the user has selected the action (i.e., verb) “governing” as reflected in field 331. As a result, the NCR displays the top articles 332 that show the current entity “Jennifer Brunner” in a governing relation with other entities. The top recommended images section 333 and videos section 334 have been updated as well. In some embodiments the user interface control 330 also includes modifiers of the various named entities, so that the user may follow leads and find more information on, for example, the roles of the various entities.

The NLP-based search processing identifies the topmost entities in the relationship displayed by the articles recommended in section 322. That is, these are the entities involved in a “governing” relationship with “Jennifer Brunner.” FIG. 3D lists these related entities in section 340, which can be selected to further filter the top articles display. For example, when the user selects the “Oberlin College” link 346, the filtering (an abbreviated EQL) is shown in area 341, the articles are changed to reflect the selection in top articles 342, and the recommended images links 343 and videos links 344 are also updated. By selecting the icon 350, the user is able to navigate to the profile page for that entity when one is available. FIG. 3E is an example of the profile page 351 for Oberlin College displayed when the icon 350 is selected for the Oberlin College link 346. The user interface control 352 shows the actions for “Oberlin College” that can be selected for further filtering. The top articles 353 and images 354 are updated for the entity “Oberlin College.”
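
The two-stage narrowing described above (first by action, then by a related entity) can be sketched as follows. This is a simplified illustration under stated assumptions: each article is presumed to carry relations already extracted by NLP processing, the profiled entity is matched only as the subject of a relation, and the article titles and relations shown are invented placeholders. It does not represent EQL or the actual filtering mechanism.

```python
def filter_articles(articles, entity, action=None, other=None):
    """Return titles of articles relating `entity` to something.

    Optionally restrict to a given `action` (verb) and/or a given
    `other` entity on the receiving end of that action.
    """
    def matches(relation):
        subject, verb, obj = relation
        return (
            subject == entity
            and (action is None or verb == action)
            and (other is None or obj == other)
        )

    return [a["title"] for a in articles if any(matches(r) for r in a["relations"])]


# Placeholder articles with made-up (subject, verb, object) relations.
ARTICLES = [
    {"title": "Sample article A", "relations": [("Jennifer Brunner", "govern", "Ohio")]},
    {"title": "Sample article B", "relations": [("Jennifer Brunner", "govern", "Oberlin College")]},
    {"title": "Sample article C", "relations": [("Jennifer Brunner", "visit", "Oberlin College")]},
]

# Stage 1: filter by the selected action.
BY_ACTION = filter_articles(ARTICLES, "Jennifer Brunner", action="govern")
# Stage 2: further filter by a selected related entity.
BY_ENTITY = filter_articles(
    ARTICLES, "Jennifer Brunner", action="govern", other="Oberlin College"
)
print(BY_ACTION)
print(BY_ENTITY)
```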

FIGS. 4A-4C are example screen displays of another type of NCR widget presented adjacent to content. In this case, the NCR 402 is displayed below the news article 401. The behavior of this NCR widget is similar to that described with reference to FIGS. 1A-1E. FIG. 4A illustrates what the NCR looks like when it is invoked. FIG. 4B illustrates the results when a user selects the connection node “White House” 404 (in relation to Barack Obama 403). FIG. 4C illustrates example results when the user selects a related named entity in NCR widget 420. In particular, when the user selects one of the named entities in the recommended articles, here “New York Times” link 421, the connection map 423 and the related top articles 422 are changed to reflect that entity as the focus. Other behaviors are of course possible.

FIGS. 6-13 provide a variety of additional forms for the user interfaces of example embodiments of an NCR.

FIGS. 6A-6D illustrate example screen displays for an example embodiment of an NLP-Based Content Recommender in the form of (hypertext) links to further information. The link can be used to navigate to the information, which is based upon the entities recognized in the underlying content. For example, in FIGS. 6A and 6D, several recommendation user interface controls and “tips” are illustrated (and presumed to be based upon the underlying content shown, or resultant from a relationship search). In particular, tip 609 displays information relating to Al Qaeda and tip 601 displays information relating to Barack Obama. For each of these NCR tips, other forms/presentations are displayed beneath them.

As described above, the layout of an NCR tip or user interface control (UI control) may depend upon the information available. Generally, in the example illustrated in FIGS. 6A-6D, the name of the entity 602 (e.g., Barack Obama) is presented, followed by the entity types and roles relating to the entity 603 (e.g., senator, democrat, presidential candidate). Then, for some tips and/or UI controls, a list of facts about the entity 604 or 608, with or without tags, and/or an overview 607 of further content is displayed. In at least some embodiments, an image 606 associated with the named entity is also displayed. Importantly, if more information (as determined by the NCR) is available, then a link 605 (also referred to as a hyperlink, hypertext, or other indicator) may be displayed. The link 605 may be further navigated by a user to display recommended content.

For example, as shown in larger images in FIGS. 6B and 6C, the link 605 may be used to navigate to an NCR widget provided, for example, on a designated website, or transparently. In FIGS. 6A and 6C, a list 660 is displayed of the recognized entities in an underlying text segment. This list 660 presents an indicator of the name of the entity, optionally followed by a symbol 611 of some sort, when further content is available. For example, when “Barack Obama” is selected, one of the tips 601 is displayed as previously described. Similarly, when the “United States of America” is selected, a UI control such as tip 620 is presented. In addition to the (ordered) list of named entities 660, the NCR widget presents a set of actions 612, and, when an action is selected, a list of the relationships 613. In NLP terminology, selecting the action (or verb) will generate a representation of the subjects or objects related to the selected entity via that verb. In at least some embodiments, a list of the most relevant articles 614 to the currently displayed article is also presented. This list can be implemented using the InFact®/Evri search technology described in detail elsewhere. For example, the summary sentence that is displayed for each article may indicate where the specific relationship was found.

According to one example embodiment, to populate the fields of the tip or UI control, such as action list 612 and connections list (relationships list) 613, an IQL/EQL query may be performed against the last “W” weeks of news content to return related information. In the illustrated case, “N” results are returned for actions performed by the entity, in this case United States of America, sorted by action (verb) frequency. The top “V” verbs are then displayed, as seen in action list 612. In other embodiments, actions could be derived from an NLP-based relationship extraction of the context (trigger) text or a set of documents related to the context text, or from other sources.
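
The “W” weeks, “N” results, and top “V” verbs parameters can be illustrated with a short sketch. It does not issue an IQL/EQL query; instead it assumes a list of already extracted relations, each with a publication date, and all names (top_actions, the dictionary keys, the sample verbs and dates) are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def top_actions(relations, entity, today, weeks, max_results, top_verbs):
    """Return the most frequent verbs performed by `entity` in recent content.

    weeks       -- "W": how many weeks of content to consider
    max_results -- "N": cap on the number of matching results examined
    top_verbs   -- "V": how many verbs to display
    """
    cutoff = today - timedelta(weeks=weeks)
    recent = [
        r for r in relations
        if r["subject"] == entity and r["published"] >= cutoff
    ]
    recent = recent[:max_results]
    counts = Counter(r["verb"] for r in recent)
    # Sort by action (verb) frequency, most frequent first.
    return [verb for verb, _ in counts.most_common(top_verbs)]


USA = "United States of America"
# Made-up relations standing in for query results.
SAMPLE = [
    {"subject": USA, "verb": "sign", "published": date(2008, 9, 1)},
    {"subject": USA, "verb": "sign", "published": date(2008, 9, 8)},
    {"subject": USA, "verb": "announce", "published": date(2008, 9, 10)},
    {"subject": USA, "verb": "sign", "published": date(2008, 9, 12)},
    {"subject": USA, "verb": "announce", "published": date(2008, 9, 13)},
    {"subject": USA, "verb": "reject", "published": date(2008, 9, 14)},
    {"subject": USA, "verb": "host", "published": date(2008, 1, 1)},
    {"subject": "Barack Obama", "verb": "visit", "published": date(2008, 9, 12)},
]

ACTION_LIST = top_actions(
    SAMPLE, USA, today=date(2008, 9, 15), weeks=2, max_results=100, top_verbs=2
)
print(ACTION_LIST)
```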

FIGS. 7A and 7B are example screen displays that illustrate use of the widgets shown in FIGS. 6A-6D integrated into an application, such as a news content provider site. In FIGS. 7A and 7B, underlying content 700, such as a news article about Barack Obama, is presented, for example, on a web page. Either automatically, or when explicitly or implicitly indicated by a user (depending upon the news platform implementation), an information widget such as widget 701 is displayed. This widget 701 has similar fields to those described with reference to FIGS. 1B-1D above.

The progression from FIG. 7A to 7B shows how the illustrated NCR widget can be dynamically updated as information is found or computed. For example, the widget can populate the relationships field 704 based upon the content shown in the most relevant articles field 710, which in turn is based upon the selected entity from entity list 702 and the selected action from action list 703. In at least some embodiments, the content of these fields is periodically updated, potentially automatically (and transparently) by rerunning the appropriate NLP queries on a periodic or defined schedule.

FIGS. 8A and 8B illustrate example screen displays for an exampleembodiment of an NLP-Based Content Recommender in the form of graphicallinks that can be used to navigate to further information. In this userinterface paradigm, relationships are represented as connected nodes,and recommended content is used as “annotations” to the nodes and/or theconnectors. For example, in FIG. 8A, several entities 801, 802, 804, and805 are shown linked through their relationships. Entities 801 and 802are person entities; whereas entities 804 and 805 are an organizationentity and an event entity, respectively.

When the user hovers over or otherwise selects the named entity “Keala Kennelly” 801, a tip 850 is displayed with initial information similar to that described with reference to FIGS. 6A-6D. Again, part of the displayed tip is a link (here labeled “(read, more)” 852) to further information. When a user navigates through the link, a detailed entity page 860 is displayed, which can be populated not just with static information, but with further content accessible via an NCR widget.

As shown in the larger image in FIG. 8B, the relationship of entity Keala Kennelly 801 to the ASP Women's World Tour 2006 event entity 829 is represented in summarized form in tip 831. When expanded by selecting a “more” graphical indicator 832, a more extended form of related content page 830 is displayed. The extended form 830 shows a list of categories of related content 834, for example news & blogs, pictures and video, and a related website. An embodiment of an NCR widget can be used to present and drive the content and/or the links displayed in the extended page 830. The user can return to the summary form by selecting a “less” graphical indicator 832.

FIGS. 9, 10A, 10B, 11A-11C, and 12A-12C illustrate additional alternatives for providing user interfaces and/or tips via an NCR widget used to provide related or recommended content.

FIG. 9 is another example screen display of a graphical representation of connections. A graphical representation is shown of the connections between a subject entity, here “Keala Kennelly,” and all of the entities she interacts with. Entities having more distant connections, for example, as determined by the frequency of the relationships encountered, are displayed as nodes that appear further from the node that represents Keala.
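
One simple way to realize such a layout, offered only as an assumption-laden sketch, is to place the subject entity at the origin and give each connected entity a radius that shrinks as its relationship frequency grows. The linear mapping, the radius bounds, the even angular spacing, and the second sample entity name are all illustrative choices not taken from the figures.

```python
import math


def layout_connections(frequencies, min_radius=1.0, max_radius=3.0):
    """Map {entity: relationship count} to {entity: (x, y)} around (0, 0).

    Counts are assumed to be positive. The most frequently related entity
    sits at `min_radius`; less frequent ones are pushed toward `max_radius`.
    """
    if not frequencies:
        return {}
    top = max(frequencies.values())
    positions = {}
    for i, (name, freq) in enumerate(sorted(frequencies.items())):
        # Higher frequency means a closer connection, hence a smaller radius.
        radius = min_radius + (max_radius - min_radius) * (1 - freq / top)
        # Spread nodes evenly around the central node.
        angle = 2 * math.pi * i / len(frequencies)
        positions[name] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


# Hypothetical relationship counts for entities connected to the subject.
FREQUENCIES = {
    "ASP Women's World Tour 2006": 8,
    "Example Organization": 2,  # placeholder name
}

POSITIONS = layout_connections(FREQUENCIES)
for name, (x, y) in POSITIONS.items():
    print(name, round(math.hypot(x, y), 2))
```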

FIGS. 10A and 10B illustrate another interface for presenting related content to an underlying named entity, for example, one either selected by a user directly, or perhaps even indirectly via entity recognition of entities presented on an underlying web page. In the illustrated example, content relating to a named entity 1001 “Arnold Schwarzenegger” is presented. Fast facts area 1004 displays a number of tidbits of quick information regarding the named entity 1001, which may be available as determined by the frequency of information gleaned during the natural language based analysis of related information or other contextual information. Roles list 1002 contains a list of all of the roles (facets or categories) found for the named entity 1001. A detailed entity description 1005 is shown followed by a graphical representation of his roles, which display shows a “weighting” associated with such roles. Questions area 1006 illustrates the use of query templates and navigation tips for finding and presenting related information without the user needing to type in a query via a query language such as IQL/EQL. Related entities area 1008, also supported by comprehensive NLP-based searching and indexing, allows the user to navigate to other related information.

FIGS. 11A-11C illustrate another example NCR widget that combines some of the previously described textual and graphical presentations to present related and/or auxiliary information. For example, in FIG. 11A, the user is presented with an NCR widget 1110 displayed in the foreground of the underlying (news) content 1100. The widget presents a list 1102 with quick summaries of the most relevant similar articles to the underlying content 1100 along with a graphical representation of the “connections” (relationships) 1107 to entities that appear in the article 1105 selected from the related articles list 1102. FIG. 11B shows an alternative graphical representation of the connections 1112 derived from a selected article 1111 of articles list 1102. FIG. 11C is an illustration of an image 1120 rendered in response to user selection of image 1115 from a display of images. Text 1113 shows story highlights from selected article 1111.

FIGS. 12A-12C illustrate another example NCR widget integrated into a website that provides links to news and blog information. The presentations of this NCR widget focus on timeliness and frequency concepts, and thus the various displays may be organized differently than might be presented elsewhere. For example, the article summaries in list 1200 displayed under the “Related News and Blogs” tab may be beneficial in social networking and/or blogging venues in that they are brief and list the source of the content and the time when posted. In addition, in FIG. 12B, under the “Most Popular Content” tab, the entity names that appear in the most frequent news and blog postings are displayed with graphical indications according to their importance to and frequency found within the documents being searched (for example, in real time). For example, some entities in list 1201 are presented in different size fonts, different colors, etc. FIG. 12C illustrates, under the “Connections” tab, a representation of the connections (relationships) 1202 that may be explored in the articles summarized in article summary list 1200. These connection nodes are the result of relationship queries on the underlying documents summarized in article summary list 1200.
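The frequency-based presentation of entity names under the “Most Popular Content” tab can be sketched as follows. The sketch assumes a simple linear scaling from posting frequency to font size; the function name `font_sizes`, the point-size bounds, and the sample counts are hypothetical, and an actual embodiment may weight importance differently.

```python
def font_sizes(entity_counts, min_pt=10, max_pt=28):
    """Linearly scale each entity's posting frequency to a font size in points.

    Assumption (not from the disclosure): the least frequent entity is
    drawn at `min_pt`, the most frequent at `max_pt`.
    """
    if not entity_counts:
        return {}
    low = min(entity_counts.values())
    high = max(entity_counts.values())
    span = high - low
    sizes = {}
    for entity, count in entity_counts.items():
        if span == 0:
            # All entities are equally frequent; use a single size.
            sizes[entity] = max_pt
        else:
            fraction = (count - low) / span
            sizes[entity] = round(min_pt + fraction * (max_pt - min_pt))
    return sizes


# Hypothetical counts of postings that mention each entity.
popular = {"Entity A": 50, "Entity B": 30, "Entity C": 10}
sizes = font_sizes(popular)
```

The same scaled value could drive color or other graphical indications instead of font size.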

Other representations for presenting recommended content by means of an NLP-Based Content Recommender are also contemplated. It is notable that many such representations hide the power of the underlying relationship indexing and searching technology by giving the user simple navigation tools and hints for getting more information. Moreover, the information is determined, calculated, and presented in substantially real-time or near real-time, and may be dynamically updated periodically, or at specified intervals, or according to different schedules.

An NCR widget may be implemented using standard programming techniques that leverage the capabilities of an NLP-based processing engine that can perform indexing and relationship searching. It is to be understood that, although the interfaces illustrated in FIGS. 1B-12 are described as incorporating the powerful capabilities of NLP processing, less sophisticated searching techniques can also take advantage of the user interface designs of such widgets, tips, and user interface controls to the extent they are able to generate a portion of the content. For example, using a standard keyword search that pattern matches terms, some number of the entities referred to in an underlying article may be uncovered using frequency counts; however, to the extent the text is complex (and, for example, contains aliases, coreferences, pronouns, ambiguous nouns, etc.) it is not possible to confidently discover and subsequently list all of the named entities in the underlying document. To do this, the document must be “understood.” Accordingly, the sophisticated and powerful natural language technology supporting the content recommenders described herein can be used to achieve far improved results.
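The limitation of keyword pattern matching noted above can be made concrete with a small sketch. It counts exact occurrences of a name in an invented sample text; the function name `keyword_mention_counts` and the sample sentences are hypothetical. The sample refers to the same person four times (full name, title, pronoun, and surname alone), yet the exact-match count finds only one of them.

```python
import re
from collections import Counter


def keyword_mention_counts(text, entity_names):
    """Count exact, case-insensitive occurrences of each full name."""
    counts = Counter()
    for name in entity_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        counts[name] = len(re.findall(pattern, text, flags=re.IGNORECASE))
    return counts


# Invented sample text: the alias ("The senator"), the pronoun ("she"),
# and the partial name ("Doe") all refer to the same person.
text = (
    "Jane Doe signed the bill on Monday. "
    "The senator said she would review it again next year. "
    "Doe later spoke to reporters."
)
counts = keyword_mention_counts(text, ["Jane Doe"])
```

Resolving the remaining three references requires the alias and coreference handling that the text attributes to natural language analysis; this sketch does not attempt it.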

Also, although certain terms are used primarily herein, other terms could be used interchangeably to yield equivalent embodiments and examples. In addition, terms may have alternate spellings which may or may not be explicitly mentioned, and all such variations of terms are intended to be included. In addition, in the following description, numerous specific details are set forth, such as data formats and code sequences, etc., in order to provide a thorough understanding of the described techniques. The embodiments described also can be practiced without some of the specific details described herein, or with other specific details, such as changes with respect to the ordering of the code flow, different code flows, etc. Thus, the scope of the techniques and/or functions described is not limited by the particular order, selection, or decomposition of steps described with reference to any particular routine.

FIG. 13 is an example block diagram of an example computing system that may be used to practice embodiments of an NLP-Based Content Recommender. Note that a general purpose or a special purpose computing system may be used to implement an NCR. Further, the NCR may be implemented in software, hardware, firmware, or in some combination to achieve the capabilities described herein.

Computing system 1300 may comprise one or more server and/or client computing systems and may span distributed locations. In addition, each block shown may represent one or more such blocks as appropriate to a specific embodiment or may be combined with other blocks. Moreover, the various blocks of the NCR 1310 may physically reside on one or more machines, which use standard (e.g., TCP/IP) or proprietary interprocess communication mechanisms to communicate with each other.

In the embodiment shown, computer system 1300 comprises a computer memory (“memory”) 1301, a display 1302, one or more Central Processing Units (“CPU”) 1303, Input/Output devices 1304 (e.g., keyboard, mouse, CRT or LCD display, etc.), other computer-readable media 1305, and network connections 1306. The NCR 1310 is shown residing in memory 1301. In other embodiments, some portion of the contents, some of, or all of the components of the NCR 1310 may be stored on and/or transmitted over the other computer-readable media 1305. The components of the NCR 1310 preferably execute on one or more CPUs 1303 and perform entity identification and present content recommendations, as described herein. Other code or programs 1330 and potentially other data repositories, such as data repository 1320, also reside in the memory 1301, and preferably execute on one or more CPUs 1303. Of note, one or more of the components in FIG. 13 may not be present in any particular implementation. For example, some embodiments embedded in other software may not provide means for other user input or display.

In one embodiment, the NCR 1310 includes an entity identification engine 1311, a knowledge analysis engine 1312, an NCR user interface support module 1313, an NLP parsing engine or preprocessor 1314, an NCR API 1317, a data repository (or interface thereto) for storing document NLP data 1316, and a knowledge data repository 1315, for example, an ontology index, for storing information from a multitude of internal and/or external sources. In at least some embodiments, one or more of the NLP parsing engine/preprocessor 1314, the entity identification engine 1311, and the knowledge analysis engine 1312 are provided external to the NCR and are available, potentially, over one or more networks 1380. Other and/or different modules may be implemented. In addition, the NCR 1310 may interact via a network 1380 with applications or client code 1355 that uses results computed by the NCR 1310, one or more client computing systems 1360, and/or one or more third-party information provider systems 1365, such as purveyors of information used in knowledge data repository 1315. Also, of note, the knowledge data 1315 and the document data 1316 may be provided external to the NCR as well, for example, and be accessible over one or more networks 1380 to the NCR.
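One way the modules named above might be composed is sketched below. This is an illustrative arrangement only: the class names, method names, and repository schema are hypothetical, and the entity identification step uses a substring lookup as a placeholder for the NLP parsing and linguistic analysis that an actual embodiment would perform.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeRepository:
    """Stand-in for knowledge data repository 1315 (hypothetical schema)."""
    related: dict = field(default_factory=dict)  # entity -> related entities


class EntityIdentificationEngine:
    """Stand-in for engine 1311; real embodiments rely on NLP parsing."""

    def __init__(self, known_entities):
        self.known_entities = list(known_entities)

    def identify(self, text_segment):
        # Placeholder: substring lookup instead of linguistic analysis.
        return [e for e in self.known_entities if e in text_segment]


class KnowledgeAnalysisEngine:
    """Stand-in for engine 1312; looks up entities connected to each input."""

    def __init__(self, repository):
        self.repository = repository

    def related_content(self, entities):
        return {e: self.repository.related.get(e, []) for e in entities}


class ContentRecommender:
    """Wires the two engines together: identify entities, then derive content."""

    def __init__(self, identifier, analyzer):
        self.identifier = identifier
        self.analyzer = analyzer

    def recommend(self, text_segment):
        entities = self.identifier.identify(text_segment)
        return self.analyzer.related_content(entities)


# Hypothetical data for illustration.
repository = KnowledgeRepository(related={"Acme Corp": ["Example Event"]})
recommender = ContentRecommender(
    EntityIdentificationEngine(["Acme Corp", "Other Org"]),
    KnowledgeAnalysisEngine(repository),
)
result = recommender.recommend("Acme Corp announced a merger.")
```

Because each engine is passed in at construction time, any of them could be replaced by a client for a remotely hosted service, consistent with the statement that the engines may be provided external to the NCR.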

In an example embodiment, components/modules of the NCR 1310 are implemented using standard programming techniques. However, a range of programming languages known in the art may be employed for implementing such example embodiments, including representative implementations of various programming language paradigms, including but not limited to, object-oriented (e.g., Java, C++, C#, Smalltalk), functional (e.g., ML, Lisp, Scheme, etc.), procedural (e.g., C, Pascal, Ada, Modula, etc.), scripting (e.g., Perl, Ruby, Python, JavaScript, VBScript, etc.), declarative (e.g., SQL, Prolog, etc.), etc.

The embodiments described use well-known or proprietary synchronous or asynchronous client-server computing techniques. However, the various components may be implemented using more monolithic programming techniques as well, for example, as an executable running on a single CPU computer system, or alternately decomposed using a variety of structuring techniques known in the art, including but not limited to, multiprogramming, multithreading, client-server, or peer-to-peer, running on one or more computer systems each having one or more CPUs.

Some embodiments are illustrated as executing concurrently and asynchronously and communicating using message passing techniques. Equivalent synchronous embodiments are also supported by an NCR implementation.
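The equivalence between an asynchronous, message-passing arrangement and a synchronous one can be sketched with Python's standard `asyncio` queues. The worker, the sentinel convention, and the capitalized-word placeholder for entity identification are all assumptions made for illustration; they are not drawn from the described embodiments.

```python
import asyncio


def identify_entities(text):
    """Placeholder for entity identification (capitalized words only)."""
    return [w.strip(".,") for w in text.split() if w[:1].isupper()]


async def identification_worker(inbox, outbox):
    """Consume text messages and post entity lists until a None sentinel."""
    while True:
        text = await inbox.get()
        if text is None:
            await outbox.put(None)
            break
        await outbox.put(identify_entities(text))


async def run_async(texts):
    """Asynchronous variant: components communicate only through queues."""
    inbox, outbox = asyncio.Queue(), asyncio.Queue()
    worker = asyncio.create_task(identification_worker(inbox, outbox))
    for text in texts:
        await inbox.put(text)
    await inbox.put(None)  # signal that no more messages will arrive
    results = []
    while (item := await outbox.get()) is not None:
        results.append(item)
    await worker
    return results


def run_sync(texts):
    """Synchronous variant: the same work as direct function calls."""
    return [identify_entities(text) for text in texts]


texts = ["Alice met Bob.", "the meeting ran late"]
async_results = asyncio.run(run_async(texts))
sync_results = run_sync(texts)
```

Both variants produce the same results for the same inputs; they differ only in how the components are scheduled and how data moves between them.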

In addition, programming interfaces to the data stored as part of the NCR 1310 (e.g., in the data repositories 1315 and 1316) can be made available by standard means such as through C, C++, C#, and Java APIs; libraries for accessing files, databases, or other data repositories; through scripting languages such as XML; or through Web servers, FTP servers, or other types of servers providing access to stored data. The data repositories 1315 and 1316 may be implemented as one or more database systems, file systems, or any other method known in the art for storing such information, or any combination of the above, including implementation using distributed computing techniques.

Also, the example NCR 1310 may be implemented in a distributed environment comprising multiple, even heterogeneous, computer systems and networks. For example, in one embodiment, the modules 1311-1314, and 1317, and the data repositories 1315 and 1316 are all located in physically different computer systems. In another embodiment, various modules of the NCR 1310 are hosted each on a separate server machine and may be remotely located from the tables which are stored in the data repositories 1315 and 1316. Also, one or more of the modules may themselves be distributed, pooled or otherwise grouped, such as for load balancing, reliability or security reasons. Different configurations and locations of programs and data are contemplated for use with the techniques described herein. A variety of distributed computing techniques are appropriate for implementing the components of the illustrated embodiments in a distributed manner including but not limited to TCP/IP sockets, RPC, RMI, HTTP, Web Services (XML-RPC, JAX-RPC, SOAP, etc.). Other variations are possible. Also, other functionality could be provided by each component/module, or existing functionality could be distributed amongst the components/modules in different ways, yet still achieve the functions of an NCR.

Furthermore, in some embodiments, some or all of the components of the NCR may be implemented or provided in other manners, such as at least partially in firmware and/or hardware, including, but not limited to, one or more application-specific integrated circuits (ASICs), standard integrated circuits, controllers (e.g., by executing appropriate instructions, and including microcontrollers and/or embedded controllers), field-programmable gate arrays (FPGAs), complex programmable logic devices (CPLDs), etc. Some or all of the system components and/or data structures may also be stored as contents (e.g., as executable or other machine-readable software instructions or structured data) on a computer-readable medium (e.g., as a hard disk; a memory; a computer network or cellular wireless network or other data transmission medium; or a portable media article to be read by an appropriate drive or via an appropriate connection, such as a DVD or flash memory device) so as to enable or configure the computer-readable medium and/or one or more associated computing systems or devices to execute or otherwise use or provide the contents to perform at least some of the described techniques. Some or all of the system components and data structures may also be transmitted as contents of generated data signals (e.g., by being encoded as part of a carrier wave or otherwise included as part of an analog or digital propagated signal) on a variety of computer-readable transmission mediums, including wireless-based and wired/cable-based mediums, and may take a variety of forms (e.g., as part of a single or multiplexed analog signal, or as multiple discrete digital packets or frames). Such computer program products may also take other forms in other embodiments. Accordingly, embodiments of the present disclosure may be practiced with other computer system configurations.

All of the above U.S. patents, U.S. patent application publications, U.S. patent applications, foreign patents, foreign patent applications and non-patent publications referred to in this specification and/or listed in the Application Data Sheet, including but not limited to U.S. Provisional Patent Application No. 60/999,559, entitled “NLP-BASED CONTENT RECOMMENDER,” filed Oct. 17, 2007, and U.S. application Ser. No. 12/288,347, entitled “NLP-BASED CONTENT RECOMMENDER,” filed Oct. 16, 2008, are incorporated herein by reference, in their entireties.

From the foregoing it will be appreciated that, although specific embodiments have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of this disclosure. For example, the methods, techniques, and systems for entity recognition and disambiguation are applicable to architectures other than a Web-based architecture. For example, other systems that are programmed to perform natural language processing can be employed. Also, the methods, techniques, and systems discussed herein are applicable to differing query languages, protocols, communication media (optical, wireless, cable, etc.) and devices (such as wireless handsets, electronic organizers, personal digital assistants, portable email machines, game machines, pagers, navigation devices such as GPS receivers, etc.).

1-13. (canceled)
14. A computer-implemented NLP-based content recommendation system, comprising: a memory; and a content recommender module, stored in the memory, and having instructions that are configured, when executed by a computer processor, to: receive a text segment for processing; identify one or more named entities to which the received text segment refers based, at least in part, upon a natural language processing (NLP) parsing and linguistic analysis of the text segment; derive related content based at least in part upon a natural language processing parsing and linguistic analysis of entity based information of the identified one or more named entities and based upon context information associated with the named entities or from the received text segment, wherein the related content includes at least one named entity that is connected to at least one of the one or more named entities; and cause the derived related content to be presented.
15. The system of claim 14, wherein the module is further configured, when executed, to display one or more indicators for navigating to the related content.
16. The system of claim 15 wherein the module is further configured, when executed, to present the related content in response to detecting selection of at least one of the navigation indicators.
17. The system of claim 15 wherein the indicators are at least one of links, graphical symbols, icons, shapes, logos, or trademarks.
18. (canceled)
19. The system of claim 15 wherein the natural language processing parsing and linguistic analysis is initiated using a natural language query.
20. The system of claim 19 wherein the natural language query is a relationship search query.
21. The system of claim 14 wherein the content recommender module is embedded into third party software instructions as a code module separate from the third party software instructions.
22. The system of claim 14 wherein the content recommender module is embedded into a browser page, is installed as a plug-in module, or is installed as a pop-up window.
23. The system of claim 14 wherein the content recommender module is displayed adjacent content controlled by the third party.
24. The system of claim 14 wherein the content recommender module has associated representations that are displayed on a display screen and selectable to invoke the functionality of the content recommender module.
25. The system of claim 24 wherein the associated representations are at least one of icons, images, or graphical symbols.
26. The system of claim 14 wherein the content recommender module is customized by presentation of a different user interface, color scheme, or capability.
27. The system of claim 26 wherein the content recommender module is customized based upon the context within which it is integrated.
28. The system of claim 14 wherein the content recommender is invoked using a scripting language.
29. A non-transitory computer-readable medium containing content that, when executed, causes a computing system to perform a method comprising: receive a text segment for processing; identify one or more named entities to which the received text segment refers based, at least in part, upon a natural language processing (NLP) parsing and linguistic analysis of the text segment; derive related content based at least in part upon a natural language processing parsing and linguistic analysis of entity based information and based upon context information associated with the named entities or from the received text segment, wherein the related content includes at least one named entity that is connected to at least one of the one or more named entities; and cause the derived related content to be presented.
30. The non-transitory computer-readable medium of claim 29 wherein the method further comprises causing display of one or more indicators for navigating to the related content.
31. The non-transitory computer-readable medium of claim 30 wherein the method further comprises causing the related content to be presented in response to detecting selection of at least one of the navigation indicators.
32. The non-transitory computer-readable medium of claim 31 wherein the indicators are links.
33. A method in a computer system for providing additional content comprising: receiving a text segment for processing; identifying one or more named entities to which the received text segment refers based, at least in part, upon a natural language processing (NLP) parsing and linguistic analysis of the text segment; deriving related content based at least in part upon a natural language processing parsing and linguistic analysis of entity based information and based upon context information associated with the named entities or from the received text, wherein the related content includes at least one named entity that is connected to at least one of the one or more named entities; and returning the related content.
34. The method of claim 33, further comprising: providing a script that defines a user interface widget that is configured, when executed, to send a request for related content; enabling the provided script to be embedded in a Web page; and responsive to a request to provide the Web page, serving the Web page with the embedded script such that, when the embedded script is executed, the user interface widget is presented to provide the related content.
35. The method of claim 33 wherein the acts are provided responsive to a request from a user interface widget executing in a client application.
35. The method of claim 33 wherein the method further comprises causing display of one or more indicators for navigating to the related content.
36. The method of claim 35 wherein the method further comprises causing the related content to be presented in response to detecting selection of at least one of the navigation indicators.